Competition between folding and aggregation in a model for protein solutions 



Moumita Maiti^1, Madan Rao^2,3 and Srikanth Sastry^1
^1 Theoretical Sciences Unit, JNCASR, Jakkur, Bangalore 560065, India
^2 Raman Research Institute, C.V. Raman Avenue, Bangalore 560080, India
^3 National Centre for Biological Sciences (TIFR), Bellary Road, Bangalore 560065, India

We study the thermodynamic and kinetic consequences of the competition between single-protein 
folding and protein-protein aggregation using a phenomenological model, in which the proteins 
can be in the unfolded (U), misfolded (M) or folded (F) states. The phase diagram shows the 
coexistence between a phase with aggregates of misfolded proteins and a phase of isolated proteins 
(U or F) in solution. The spinodal at low protein concentrations shows non-monotonic behavior with 
temperature, with implications for the stability of solutions of folded proteins at low temperatures. 
We follow the dynamics upon "quenching" from the U-phase (cooling) or the F-phase (heating) to
the metastable or unstable part of the phase diagram, which results in aggregation. We describe how
the competition between folding and aggregation leads to interesting consequences for the
aggregate size distribution and the growth kinetics.



Many proteins aggregate under certain conditions;
some, such as amyloid β and prion, are associated with
debilitating and possibly fatal human diseases [1]. This
has motivated a number of biophysical studies on the
nature and dynamics of aggregates at different scales [1].
It is widely held that proteins within an aggregate are
typically misfolded; further, that protein aggregation is
initiated by misfolded structures.

This immediately suggests an interplay between the
dynamics of folding and aggregation, especially at large
concentrations (as in the cell interior [5]), where
intra-protein interactions compete with inter-protein
interactions. Here, we explore the thermodynamic landscape
of steady states arising from this competition, using a
phenomenological model. A number of theoretical and
experimental studies suggest the possible utility of such
an approach [1].

To apply to a diverse range of proteins, our model
needs to be reasonably generic, and should therefore
incorporate only a minimal number of features common to all
aggregating proteins. Consider N proteins of molecular
weight L in a solvent of volume V at temperature T;
we represent the complex folding internal-energy
landscape by a coarse-grained one with just three states:
unfolded or random coil (U), folded or native (F),
and misfolded or intermediate (M). These
single-protein states differ in their internal energies and
configurational entropies: U is taken to have zero internal
energy (it defines the zero of energy) and finite entropy
per site ($\ln W$); F is the unique global energy minimum
($-\epsilon_0 < 0$); and M is taken to have an intermediate
energy ($-\epsilon_m$) with finite entropy per site ($\ln w$). Note
that the degeneracies $W, w \sim O(e^L)$.

This single-particle picture is modified as soon as we
include inter-protein interactions. In general, the specific
and nonspecific contributions to the inter-protein
attraction result in short-range, anisotropic interactions;
however, to keep the analysis simple, we presently consider
only short-range attractive interactions between proteins
in the M-state, represented by a square well of range
$\sigma$ and strength $J$.



We work with a three-dimensional (3D) lattice-gas
model, where proteins occupy a fraction $\rho = N\sigma^3/V$
of the sites of a cubic lattice with coordination number
$q = 6$ (we take $\sigma = 1$). We define occupancy
variables $n_i \in \{0, 1\}$ at each lattice site and state variables
$d_i \in \{-1\,(F), 0\,(M), 1\,(U)\}$ at each occupied site. The
lattice Hamiltonian includes the on-site free energy of each
state and the nearest-neighbour attraction $J$ between proteins
in the M state; we set $k_B = 1$.
The three states are characterised by the concentrations of
unfolded ($\rho_u$), misfolded ($\rho_m$) and folded ($\rho_f$)
proteins, with $\rho = \rho_f + \rho_m + \rho_u$. It is convenient to follow
the thermodynamic behaviour in the $(T, \rho, \rho_m)$ space,
and to write $\rho_u = (\rho - \rho_m)x$ and $\rho_f = (\rho - \rho_m)(1 - x)$.
We start with mean-field theory: the energy density
$e = -T \ln W \, \rho_u - (\epsilon_m + T \ln w)\rho_m - \epsilon_0 \rho_f - \frac{qJ}{2}\rho_m^2$
and the entropy density
$s = -[(1-\rho)\ln(1-\rho) + \rho_u \ln \rho_u + \rho_m \ln \rho_m + \rho_f \ln \rho_f]$
can be combined to obtain the grand potential density
$\Omega = -P = e - Ts - \mu\rho$, where $\mu$ is the chemical
potential.
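The mean-field grand potential above can be evaluated directly. The sketch below is an illustration rather than the authors' code: it assumes the coupling term is $-(qJ/2)\rho_m^2$ (the form consistent with $T_c = qJ/4$) and that $s$ is the usual mixing entropy, carrying an overall minus sign. It compares a brute-force minimisation over $x$ with the closed form $x = W/(W + e^{\epsilon_0/T})$ that follows from $\partial\Omega/\partial x = 0$. All function names are hypothetical.

```python
import math

# Parameters quoted in the text, in units where J = 1 and k_B = 1.
W, w = 1.0e4, 1.0e3      # degeneracies of the U and M states
EPS0, EPSM = 3.0, 0.35   # folded and misfolded energies (units of J)
Q, J = 6, 1.0            # coordination number and M-M attraction


def xlogx(v):
    """v ln v with the limit 0 ln 0 = 0."""
    return v * math.log(v) if v > 0.0 else 0.0


def grand_potential(rho, rho_m, x, T, mu):
    """Mean-field Omega = e - T s - mu rho (assumed form, see lead-in)."""
    rho_u = (rho - rho_m) * x
    rho_f = (rho - rho_m) * (1.0 - x)
    e = (-T * math.log(W) * rho_u
         - (EPSM + T * math.log(w)) * rho_m
         - EPS0 * rho_f
         - 0.5 * Q * J * rho_m ** 2)
    s = -(xlogx(1.0 - rho) + xlogx(rho_u) + xlogx(rho_m) + xlogx(rho_f))
    return e - T * s - mu * rho


def unfolded_fraction(T):
    """Closed-form minimiser of Omega with respect to x."""
    return W / (W + math.exp(EPS0 / T))


def unfolded_fraction_numeric(rho, rho_m, T, mu, n=20001):
    """Brute-force minimisation of Omega over x on a uniform grid in (0, 1)."""
    grid = [i / (n - 1) for i in range(1, n - 1)]
    return min(grid, key=lambda x: grand_potential(rho, rho_m, x, T, mu))
```

Because $\Omega$ is convex in $x$, the grid minimum should agree with the closed form to within the grid spacing for any admissible $\rho$, $\rho_m$ and $\mu$.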

Upon minimisation with respect to $x$, $\rho$ and $\rho_m$, we get
$x = W/(W + e^{\epsilon_0/T})$ and the constitutive relations
connecting $\rho$, $\rho_m$, $\mu$ and $T$.
Fixing $\mu$ and $T$, these equations may be solved to obtain
solutions for $\rho$ and $\rho_m$. Below $T = qJ/4 = T_c$, one has
two locally stable solutions in an intermediate $\mu$ range,
signaling a phase transition, with the phase coexistence
being given by values of $(\mu, T)$ for which the two solutions
have equal $\Omega$ (i.e., the same pressure). The two






phases correspond to a low density phase where the frac- 
tion of misfolded proteins is low and a high density phase 
with a large fraction of misfolded proteins. We identify 
the latter with protein aggregation. Note that x denotes 
the fraction of proteins that are unfolded, out of those 
that are not in the misfolded state. Thus, the tempera- 
ture at which x ~ 0.5 marks a pseudo-transition point to 
the folded state. The limits of stability, or spinodal lines,
are obtained by setting the determinant of the Hessian to zero,
$\left|\partial^2 \Omega / \partial(\rho, \rho_m)\right| = 0$; they are given by
$\rho_m = \frac{1}{2}\left(1 \pm \sqrt{1 - 4T/qJ}\right)$ and meet
smoothly at the critical point. The critical temperature is
$k_B T_c = 1.5J$, and the pseudo-transition temperature
between folded and unfolded states at low concentrations is
$k_B T_f = 0.3257J$.
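The characteristic temperatures and the spinodal can be checked numerically. The sketch below rests on a reconstruction, not on the authors' own equations: it assumes the mean-field grand potential with coupling term $-(qJ/2)\rho_m^2$, for which the Hessian condition reduces to $\rho_m(1-\rho_m) = T/qJ$, and it obtains the total density from $\partial\Omega/\partial\rho_m = 0$ after $x$ has been minimised out. Function names are hypothetical.

```python
import math

# Parameters quoted in the text, in units where J = 1 and k_B = 1.
W, w = 1.0e4, 1.0e3
EPS0, EPSM = 3.0, 0.35
Q, J = 6, 1.0

T_F = EPS0 / math.log(W)   # x = 1/2 when W = exp(eps0 / T)
T_C = Q * J / 4.0          # mean-field critical temperature


def spinodal_rho_m(T):
    """Both roots of rho_m (1 - rho_m) = T / (qJ); requires T <= T_C."""
    root = math.sqrt(1.0 - 4.0 * T / (Q * J))
    return 0.5 * (1.0 - root), 0.5 * (1.0 + root)


def total_density(rho_m, T):
    """rho solving d(Omega)/d(rho_m) = 0 for a given rho_m (assumed form).

    A value above 1 is unphysical: the lattice model then has no
    spinodal on that branch at this temperature.
    """
    z_uf = W + math.exp(EPS0 / T)                   # U + F single-site weight
    z_m = w * math.exp((EPSM + Q * J * rho_m) / T)  # M weight incl. mean field
    return rho_m * (1.0 + z_uf / z_m)
```

Evaluating `total_density(spinodal_rho_m(T)[0], T)` over a range of temperatures gives the low-density spinodal in the $\rho$-$T$ projection; under these assumptions it passes through a minimum somewhat above $T_f$ and rises again on cooling.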

In our calculations, we assume parameters $W = 10000$,
$w = 1000$, $\epsilon_0 = 3J$, $\epsilon_m = 0.35J$. We choose the
energy scale $J$ by using experimental values at $T = 298$ K
[21] for the free energy difference between monomers in
the U and F states (4.4 ± O.S/ccaZ/moZe), U and M states 
(1.6 ± 0.7kcal/mole (U being the stable state), and free 
energy of formation from monomers in the U state [l^] of 
trimers (lA.dkcal / mole) and tetramers [21. Gkcal /mole) . 
Assuming that trimers have 3 and tetramers 6
interactions, we obtain $J$ to be 2.7 kcal/mole. This yields
$T_f = 441$ K. We use a value of 2.2 nm, the estimated
diameter of Aβ(1-42) [20], as the lattice spacing, and
report densities in molar units (the fully occupied lattice
corresponds to 156.67 mM). The time unit is fixed by equating
each Monte Carlo sweep (MCS) to $\tau = \sigma^2/6D = 4$ ns,
where $\sigma = 2.2$ nm is the step size by which particles are
moved each MCS and $D$ is the diffusion coefficient in
water obtained from the Stokes-Einstein relation for the
assumed particle radius of 1.1 nm.
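The unit conversions in this paragraph can be reproduced with a few lines. This is a sketch only: the viscosity of water (0.89 mPa s near 298 K) and the physical constants are assumptions supplied here, not values given in the text, and the function names are hypothetical.

```python
import math

K_B_KCAL = 1.987204e-3   # Boltzmann constant in kcal/(mol K)
K_B_SI = 1.380649e-23    # Boltzmann constant in J/K
N_A = 6.02214076e23      # Avogadro constant in 1/mol

J_KCAL = 2.7             # energy scale J from the text, kcal/mol
SIGMA = 2.2e-9           # lattice spacing from the text, m
RADIUS = 1.1e-9          # particle radius from the text, m
ETA_WATER = 0.89e-3      # ASSUMED viscosity of water near 298 K, Pa s
T_ROOM = 298.0           # K


def reduced_to_kelvin(t_reduced):
    """Convert a temperature in units of J/k_B to kelvin."""
    return t_reduced * J_KCAL / K_B_KCAL


def full_lattice_molarity_mM():
    """Concentration of a fully occupied lattice (1 mol/m^3 equals 1 mM)."""
    sites_per_m3 = 1.0 / SIGMA ** 3
    return sites_per_m3 / N_A


def sweep_time_ns():
    """tau = sigma^2 / (6 D) with D from the Stokes-Einstein relation."""
    d = K_B_SI * T_ROOM / (6.0 * math.pi * ETA_WATER * RADIUS)
    return SIGMA ** 2 / (6.0 * d) * 1e9
```

With these inputs the three results land close to the quoted 441 K, 156.67 mM and 4 ns; the small differences depend on rounding of $J$ and on the viscosity assumed.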

In Figure 1 we show the phase diagram in different
projections. Note that in both the $\rho$-$T$ and
$\rho_m$-$T$ projections, the coexistence and spinodal lines on the
low-density side show a change in slope near $T_f$. In particular,
the spinodal density in the $\rho$-$T$ projection retraces to
higher values at temperatures below $T_f$.

The scenario described by our approximate mean-field theory
is confirmed by Monte Carlo simulations. We determine
the coexistence line (Fig. 2) by the histogram
reweighting grand canonical Monte Carlo technique and
evaluation of the global free energy. We locate
the spinodal lines by identifying chemical potential
values at which the configuration probability distribution
changes from a bimodal to a single-peak distribution.
The non-monotonicity and the bend in the spinodal are
reproduced in the Monte Carlo simulation. We note that
the phase behavior we obtain straightforwardly explains
the presence of a critical concentration for aggregation
that has been seen in experiments.

FIG. 1: Mean-field phase diagram. Panels (a)-(d) show
projections corresponding to $\rho$-$T$, $\mu$-$T$, $\rho_m$-$T$ and $P$-$T$
respectively. The pseudo-transition temperature between
native and random coil states at low concentration is indicated
in panel (a) by a cross.

We now study the kinetics of transformation following
a "quench" from an initial equilibrium phase, using
a dynamic Monte Carlo simulation in which, in addition to
the state-changing Metropolis moves, we also move
particles into neighbouring vacant sites with a probability p
related to their diffusion coefficient. We have chosen 8
qualitatively different protocols, marked (1)-(8) in Fig. 2, to
study the kinetics from a homogeneous U state ('folding'
pathway) or a homogeneous F state ('unfolding' pathway),
into the metastable ((1),(2);(3),(4)) and unstable
((5),(6);(7),(8)) regions, for a temperature above $T_f$ and
one below $T_f$. The data reported are from simulations
on a 64 × 64 × 64 lattice, with typically 150 independent
runs. Figure 3 shows the aggregate size distribution and
mean aggregate size of misfolded proteins for protocol 3,
where we quench the system to a metastable state from
the unfolded state.
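A dynamic Monte Carlo sweep of the kind described here might look like the following sketch. It is not the authors' implementation. It assumes that the on-site free energies ($-T\ln W$ for U, $-(\epsilon_m + T\ln w)$ for M, $-\epsilon_0$ for F) and a nearest-neighbour M-M attraction $J$ enter a standard Metropolis test, that hops are attempted with probability `p_move` and then accepted by the same test, and that the lattice is periodic with side L of at least 3. All names are hypothetical.

```python
import math
import random

F, M, U = -1, 0, 1                      # state labels used in the text
W, w, EPS0, EPSM, J = 1.0e4, 1.0e3, 3.0, 0.35, 1.0


def onsite(state, T):
    """On-site free energy of a single protein (k_B = 1)."""
    if state == U:
        return -T * math.log(W)
    if state == M:
        return -(EPSM + T * math.log(w))
    return -EPS0


def neighbours(site, L):
    x, y, z = site
    return [((x + 1) % L, y, z), ((x - 1) % L, y, z),
            (x, (y + 1) % L, z), (x, (y - 1) % L, z),
            (x, y, (z + 1) % L), (x, y, (z - 1) % L)]


def m_neighbours(lat, site, L, skip=None):
    """Number of M proteins adjacent to `site`, ignoring the site `skip`."""
    return sum(1 for nb in neighbours(site, L)
               if nb != skip and lat.get(nb) == M)


def accept(dE, T, rng):
    return dE <= 0.0 or rng.random() < math.exp(-dE / T)


def initial_lattice(L, rho, state, rng):
    """Homogeneous start: all proteins in one state at random sites."""
    sites = [(x, y, z) for x in range(L) for y in range(L) for z in range(L)]
    particles = rng.sample(sites, int(round(rho * L ** 3)))
    return {s: state for s in particles}, particles


def sweep(lat, particles, L, T, p_move, rng):
    """One sweep: a state-change and a hop attempt per protein on average."""
    for _ in range(len(particles)):
        i = rng.randrange(len(particles))
        site = particles[i]
        old = lat[site]
        # (1) Metropolis attempt to change the internal state.
        new = rng.choice([s for s in (F, M, U) if s != old])
        dE = onsite(new, T) - onsite(old, T)
        dE -= J * m_neighbours(lat, site, L) * ((new == M) - (old == M))
        if accept(dE, T, rng):
            lat[site] = new
        # (2) Hop to a vacant neighbouring site, attempted with p_move.
        if rng.random() < p_move:
            target = rng.choice(neighbours(site, L))
            if target not in lat:
                dE = 0.0
                if lat[site] == M:   # only M-M contacts carry energy
                    dE = -J * (m_neighbours(lat, target, L, skip=site)
                               - m_neighbours(lat, site, L))
                if accept(dE, T, rng):
                    lat[target] = lat.pop(site)
                    particles[i] = target
```

A "cooling" protocol in this sketch would call `initial_lattice(L, rho, U, rng)` and then apply `sweep` repeatedly at the final temperature; a "heating" protocol would start from `F` instead.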

The interplay of diffusion, detachment-attachment,
and state change U/F $\leftrightarrow$ M results in multiple
growth regimes and crossovers, which depend on the
quench protocol. We will highlight those features that
are generic to the aggregation dynamics in the
presence of competing energy minima. The first surprise
is that the aggregate size distribution at early times
is $P(n,t) \sim n^{-3.5 \pm 0.05}$ for small aggregates (Fig. 3), a
power law (with an exponential cutoff) rather than the
exponential distribution expected from detailed-balance
dynamics. The dynamics in the subspace of misfolded
configurations mimics the dynamics of an open system
with sources and sinks, arising from state changes to and
from U/F, for which power-law distributions are expected
[27]. We leave the analytical derivation of this power
law to a later study. Together with the robust power-law
distribution, there is a peak at finite n, indicating a large
aggregate which grows with time. At later times, when
the fraction of U/F proteins has reached steady state (no






FIG. 3: (a) Early time aggregate size distribution P(n,t)
displays power law behavior for small aggregates, and an
emerging finite size peak indicating the onset of aggregation. (b) At
late times, the distribution is exponential, and a large peak
at large sizes is seen, corresponding to the formation of large
aggregates. Mean cluster size of misfolded proteins vs. time
(c) for protocol 1 (P1) and (d) for protocol 3 (P3), showing the
intermediate time plateau and the power law growth phase.



FIG. 2: Simulation phase diagram in (a) the $\rho$-$T$ plane, and
(b) the $\rho_m$-$T$ plane. The coexistence and spinodal lines
have been obtained using the histogram reweighting
technique. Also indicated by arrows in (a) are protocols (1)-(8)
by which the protein solution is either quenched down
from the high temperature unfolded (U) phase (protocols 1,
3, 5, 7), or heated up from the low temperature folded (F)
phase (protocols 2, 4, 6, 8), into metastable (protocols 1-4;
$\rho$ = 15.67 mM) or unstable (protocols 5-8; $\rho$ = 78.35 mM)
parts of the phase diagram. The final ($\rho$, T) values for these
protocols are indicated by open circles.



'source'), $P(n,t)$ goes over to the expected exponential
distribution (Fig. 3(b)), together with a growing peak
at large n.

The dynamics of the mean aggregate size
$\langle n \rangle = \sum_n n P(n,t) / \sum_n P(n,t)$ shows multiple growth
regimes. At very early times the growth is dominated by the
conversion of isolated (or clusters of) U proteins into M;
growth via diffusion of M kicks in later. This is generically
followed by a growth plateau (which becomes less
clearly defined at high $\rho$, high T), where the largest
aggregate, which can be as large as 30 monomers, does
not grow appreciably. These intermediate structures are
probably stabilised by a cloud of U/F proteins shielding
them. Such stable intermediates have been reported in
recent studies of amyloid aggregation. We note that
a clear plateau is present when we study the system
under metastable conditions, whereas no clear plateau is
visible when the kinetics is observed in the unstable part
of the phase diagram. This feature, together with the observation
of a reentrant spinodal line that occurs at higher
densities for lower temperatures (a special feature of the
phase diagram we evaluate), can help explain the
interesting kinetics seen in [29].
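The aggregate size distribution and mean aggregate size can be measured from a lattice configuration with a standard cluster search. The sketch below is illustrative and makes assumptions the text does not state: aggregates are taken to be nearest-neighbour-connected clusters of M proteins on a periodic cubic lattice, isolated M monomers count as aggregates of size 1, and the configuration is stored as a dictionary from site to state. Names are hypothetical.

```python
from collections import Counter, deque

M = 0  # state label of misfolded proteins, as in the text


def neighbours(site, L):
    x, y, z = site
    return [((x + 1) % L, y, z), ((x - 1) % L, y, z),
            (x, (y + 1) % L, z), (x, (y - 1) % L, z),
            (x, y, (z + 1) % L), (x, y, (z - 1) % L)]


def aggregate_sizes(lat, L):
    """Sizes of connected clusters of M proteins (breadth-first search)."""
    seen, sizes = set(), []
    for start, state in lat.items():
        if state != M or start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            site = queue.popleft()
            size += 1
            for nb in neighbours(site, L):
                if nb not in seen and lat.get(nb) == M:
                    seen.add(nb)
                    queue.append(nb)
        sizes.append(size)
    return sizes


def size_distribution(sizes):
    """Normalised histogram P(n) over clusters."""
    counts = Counter(sizes)
    total = sum(counts.values())
    return {n: c / total for n, c in sorted(counts.items())}


def mean_aggregate_size(dist):
    """<n> = sum_n n P(n) / sum_n P(n)."""
    return sum(n * p for n, p in dist.items()) / sum(dist.values())
```

Averaging `size_distribution` over independent runs at fixed time gives an estimate of $P(n,t)$; the largest entry of `aggregate_sizes` tracks the growing aggregate.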

The late time growth depends on which dynamical
mechanism (diffusion, detachment-attachment or state
conversion) is dominant. Diffusion-dominated growth,
likely at high T and low $\rho$, gives rise to $\langle n \rangle \sim t$ or
$R \sim t^{1/3}$, since aggregates are compact (Fig. 3c). On the
other hand, state-conversion dynamics, which dominates
at low T, leads to $\langle n \rangle \sim t^3$ or $R \sim t$ (Fig. 3d).
Finally, detachment-dominated dynamics (at high T, high
$\rho$) should result in $\langle n \rangle \sim t^{3/2}$ or $R \sim t^{1/2}$ (though
this is hard to ascertain unambiguously from available
numerical data).
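Growth exponents of this kind are usually estimated from a straight-line fit on a log-log plot. The following sketch shows one simple way to do that; it is a generic least-squares fit, not the procedure used by the authors, and the conversion to a radius exponent assumes compact three-dimensional aggregates with $\langle n \rangle \sim R^3$. Function names are hypothetical.

```python
import math


def growth_exponent(times, sizes):
    """Least-squares slope of ln<n> versus ln t."""
    xs = [math.log(t) for t in times]
    ys = [math.log(n) for n in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def radius_exponent(size_exponent):
    """Exponent of R(t) for compact 3D aggregates, where <n> ~ R^3."""
    return size_exponent / 3.0
```

The fit should be restricted to the time window after the plateau, since mixing regimes in one fit would blur the exponent.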

Figure 4 shows the onset times for the growth phase,
$\tau$, defined as the time of departure from the intermediate
structure plateau, for $\rho$ = 15.67 mM. We quench from
the high temperature, unfolded phase ("cooling"; with an
initial condition where all proteins are unfolded), and from
the low temperature, folded phase ("heating"; with an
initial condition where all proteins are folded) respectively,
to temperatures at which the system is in the metastable
phase. While at high temperatures the crossover times for
heating and cooling runs are roughly the same, at low
temperatures (below T = 270.8 K) the onset of the growth
phase is substantially delayed when we heat up from the
folded phase, indicating the relative difficulty of nucleating
the misfolded aggregate from a solution of folded proteins.
The onset time of aggregation thus depends on the initial
state of proteins in solution, a fact which must therefore be
taken into account in interpreting experimental data.

FIG. 4: Onset times for the growth phase. (a) Inset shows
$\tau_h$ vs T and $\tau_c$ vs T. $\tau$ is the MC step defined as the time of
departure from the intermediate structure plateau in Fig. 3(c)
to the growth regime. The subscript of $\tau$ refers to the
protocol (heating vs cooling). (b) $\tau_h/\tau_c$ vs T; inset shows
$\tau_h$ vs T and $\tau_c$ vs T. The density is $\rho$ = 15.67 mM. The number
of independent runs varies from 25 to 75 in each case.

An instructive way of describing the results of the
dynamics of transformation is by Time-Temperature-Transformation
(TTT) curves, where each curve is a plot
of the time required to obtain a fraction x when quenched
to a temperature T, and may be viewed as a kinetic phase
diagram. Fig. 5 shows the TTT curves for quenches from
the high temperature U-phase, and from initial conditions
in the low temperature F-phase. Between 25 and
75 independent runs are performed at each temperature
for a system of size 64 × 64 × 64. We note that in both
the heating and cooling cases, there is a greater spread
in times at the high temperature end for transformation
fractions between 20% and 60%, as compared to the lower
temperature range, where rapid transformation occurs
following a longer lag time. Further, we note that when
the system is heated from the low temperature F-phase,
the transformation times at low temperatures are
noticeably longer. A more detailed study of the various
growth phases and the manner in which the competition
between the global thermodynamic stability of the
aggregate phase and the local stability of the folded state
determines the kinetics and morphology of aggregation is
under way.

FIG. 5: (a) TTT (time-temperature-transformation) plot
for all cooling protocols (density = 15.67 mM), initialised
with the U phase and quenched from high T;
% = 100 × (number of M-proteins / total number of proteins).
(b) TTT plot for all heating protocols (density = 15.67 mM),
initialised with the F phase and "quenched" from low T.

In this paper we have studied the thermodynamics of
the competition between folding and aggregation of
proteins using a phenomenological lattice model. There are
many interesting extensions that we plan to explore in
future. For instance, including attractive interactions
between UU and UM pairs would dramatically alter the nature
of aggregates, producing small U aggregates and
aggregates containing mixtures of M and U. These mixed
aggregates would be more flexible because the U
insertions would provide flexible hinges. Another extension
is to include changes in configurational entropy and
internal energy of the M-state upon aggregation, a feature
related to domain swapping. Including anisotropy in the
inter-protein interactions would naturally give rise to
linear and 'sheet'-like aggregates. Most importantly, by
introducing explicit intra-protein interactions to describe
the U $\leftrightarrow$ F transition, we will be able to study the effect
of aggregation on the dynamics of folding. Finally,
charge interactions are expected to induce effective
anisotropy in the aggregate morphology,
and indeed, the role of charges in the formation of
ordered aggregates has been previously noted. The
approach presented here allows for these effects to be
studied systematically.